This paper investigates the use of artificial neural networks (ANNs) to solve differential equations (DEs) and the construction of the loss function which meets both differential equation and its initial/boundary condition of a certain DE. In section 2, the loss function is generalized to $n^\text{th}$ order ordinary differential equation(ODE). Other methods of construction are examined in Section 3 and applied to three different models to assess their effectiveness.
translated by 谷歌翻译
Because of the widespread existence of noise and data corruption, recovering the true regression parameters with a certain proportion of corrupted response variables is an essential task. Methods to overcome this problem often involve robust least-squares regression, but few methods perform well when confronted with severe adaptive adversarial attacks. In many applications, prior knowledge is often available from historical data or engineering experience, and by incorporating prior information into a robust regression method, we develop an effective robust regression method that can resist adaptive adversarial attacks. First, we propose the novel TRIP (hard Thresholding approach to Robust regression with sImple Prior) algorithm, which improves the breakdown point when facing adaptive adversarial attacks. Then, to improve the robustness and reduce the estimation error caused by the inclusion of priors, we use the idea of Bayesian reweighting to construct the more robust BRHT (robust Bayesian Reweighting regression via Hard Thresholding) algorithm. We prove the theoretical convergence of the proposed algorithms under mild conditions, and extensive experiments show that under different types of dataset attacks, our algorithms outperform other benchmark ones. Finally, we apply our methods to a data-recovery problem in a real-world application involving a space solar array, demonstrating their good applicability.
translated by 谷歌翻译
Causal chain reasoning (CCR) is an essential ability for many decision-making AI systems, which requires the model to build reliable causal chains by connecting causal pairs. However, CCR suffers from two main transitive problems: threshold effect and scene drift. In other words, the causal pairs to be spliced may have a conflicting threshold boundary or scenario. To address these issues, we propose a novel Reliable Causal chain reasoning framework~(ReCo), which introduces exogenous variables to represent the threshold and scene factors of each causal pair within the causal chain, and estimates the threshold and scene contradictions across exogenous variables via structural causal recurrent neural networks~(SRNN). Experiments show that ReCo outperforms a series of strong baselines on both Chinese and English CCR datasets. Moreover, by injecting reliable causal chain knowledge distilled by ReCo, BERT can achieve better performances on four downstream causal-related tasks than BERT models enhanced by other kinds of knowledge.
translated by 谷歌翻译
ClueWeb22, the newest iteration of the ClueWeb line of datasets, provides 10 billion web pages affiliated with rich information. Its design was influenced by the need for a high quality, large scale web corpus to support a range of academic and industry research, for example, in information systems, retrieval-augmented AI systems, and model pretraining. Compared with earlier ClueWeb corpora, the ClueWeb22 corpus is larger, more varied, of higher-quality, and aligned with the document distributions in commercial web search. Besides raw HTML, ClueWeb22 includes rich information about the web pages provided by industry-standard document understanding systems, including the visual representation of pages rendered by a web browser, parsed HTML structure information from a neural network parser, and pre-processed cleaned document text to lower the barrier to entry. Many of these signals have been widely used in industry but are available to the research community for the first time at this scale.
translated by 谷歌翻译
Earth observation, aiming at monitoring the state of planet Earth using remote sensing data, is critical for improving our daily lives and living environment. With a growing number of satellites in orbit, an increasing number of datasets with diverse sensors and research domains are being published to facilitate the research of the remote sensing community. In this paper, we present a comprehensive review of more than 400 publicly published datasets, including applications like land use/cover, change/disaster monitoring, scene understanding, agriculture, climate change, and weather forecasting. We systematically analyze these Earth observation datasets with respect to five aspects volume, bibliometric analysis, resolution distributions, research domains, and the correlation between datasets. Based on the dataset attributes, we propose to measure, rank, and select datasets to build a new benchmark for model evaluation. Furthermore, a new platform for Earth observation, termed EarthNets, is released as a means of achieving a fair and consistent evaluation of deep learning methods on remote sensing data. EarthNets supports standard dataset libraries and cutting-edge deep learning models to bridge the gap between the remote sensing and machine learning communities. Based on this platform, extensive deep learning methods are evaluated on the new benchmark. The insightful results are beneficial to future research. The platform and dataset collections are publicly available at https://earthnets.github.io/.
translated by 谷歌翻译
The peer merit review of research proposals has been the major mechanism for deciding grant awards. However, research proposals have become increasingly interdisciplinary. It has been a longstanding challenge to assign interdisciplinary proposals to appropriate reviewers, so proposals are fairly evaluated. One of the critical steps in reviewer assignment is to generate accurate interdisciplinary topic labels for proposal-reviewer matching. Existing systems mainly collect topic labels manually generated by principal investigators. However, such human-reported labels can be non-accurate, incomplete, labor intensive, and time costly. What role can AI play in developing a fair and precise proposal reviewer assignment system? In this study, we collaborate with the National Science Foundation of China to address the task of automated interdisciplinary topic path detection. For this purpose, we develop a deep Hierarchical Interdisciplinary Research Proposal Classification Network (HIRPCN). Specifically, we first propose a hierarchical transformer to extract the textual semantic information of proposals. We then design an interdisciplinary graph and leverage GNNs for learning representations of each discipline in order to extract interdisciplinary knowledge. After extracting the semantic and interdisciplinary knowledge, we design a level-wise prediction component to fuse the two types of knowledge representations and detect interdisciplinary topic paths for each proposal. We conduct extensive experiments and expert evaluations on three real-world datasets to demonstrate the effectiveness of our proposed model.
translated by 谷歌翻译
功能转换旨在通过数学转换现有功能来提取良好的表示(功能)空间。应对维度的诅咒,增强模型概括,克服数据稀疏性并扩大经典模型的可用性至关重要。当前的研究重点是基于领域的知识特征工程或学习潜在表示;然而,这些方法并非完全自动化,不能产生可追溯和最佳的表示空间。在重建机器学习任务的功能空间时,可以同时解决这些限制吗?在这项扩展研究中,我们提出了一个用于特征转化的自优化框架。为了取得更好的性能,我们通过(1)获得高级状态表示来改善初步工作,以使加强代理能够更好地理解当前功能集; (2)解决Q值高估的Q值高估,以学习无偏见和有效的政策。最后,为了使实验比初步工作更具说服力,我们结论是通过五个数据集添加异常检测任务,评估各种状态表示方法,并比较不同的培训策略。广泛的实验和案例研究表明,我们的工作更有效和更高。
translated by 谷歌翻译
自动语音识别模型需要大量的语音数据进行培训,并且此类数据的收集通常会导致隐私问题。联合学习已被广泛使用,被认为是一种有效的分散技术,通过协作学习共享的预测模型,同时将数据保留在不同客户端设备上。但是,客户设备上有限的计算和通信资源给大型模型带来了实际困难。为了克服此类挑战,我们建议联合修剪以在联合环境下训练还原模型,同时与完整模型相比保持相似的性能。此外,与集中式培训相比,还可以利用大量客户数据来改善修剪结果。我们探索不同的修剪方案,并提供了我们方法有效性的经验证据。
translated by 谷歌翻译
本文介绍了使用变压器基于目标扬声器语音活动检测(TS-VAD)的扬声器诊断模型。为了克服原始的TS-VAD模型无法处理任意数量的扬声器的缺点,我们研究了使用具有可变长度时间和扬声器尺寸的输入张量的模型架构。将变压器层应用于扬声器轴,以使模型输出对提供给TS-VAD模型的扬声器配置文件的顺序不敏感。时间顺序层插入了这些说话者的变压器层之间,以允许捕获输入语音信号的时间和跨语言器相关性。我们还使用基于编码器的吸引子(EEND-EDA)将基于端到端神经诊断的诊断模型通过基于变压器的TS-VAD替换其基于DOT的扬声器检测层,从而扩展了基于端到端的神经腹泻。 VoxConverse上的实验结果表明,使用变压器进行跨言扬声器建模可将TS-VAD的诊断错误率(DER)降低10.9%,从而使新的最先进(SOTA)DER达到4.74%。此外,我们的扩展eDa-eda在呼叫者数据集上相对于原始eend-eda的模型大小将6.9%降低了6.9%,在广泛使用的培训数据设置下,新的SOTA DER为11.18%。
translated by 谷歌翻译
具有很少带注释的样本的训练语义分割模型在各种现实世界中具有巨大的潜力。对于少数拍摄的分段任务,主要的挑战是如何准确地测量使用有限的培训数据之间的支持样本和查询样品之间的语义对应关系。为了解决这个问题,我们建议用可变形的4D变压器汇总可学习的协方差矩阵,以有效预测分割图。具体而言,在这项工作中,我们首先设计了一种新颖的艰难示例挖掘机制,以学习高斯过程的协方差内核。在对应测量中,学到的协方差内核函数比现有基于余弦相似性的方法具有很大的优势。基于学到的协方差内核,设计有效的双重变形4D变压器模块旨在适应骨料特征相似性图中的分割结果。通过组合这两种设计,提出的方法不仅可以在公共基准测试上设置新的最新性能,而且比现有方法更快地收敛。三个公共数据集的实验证明了我们方法的有效性。
translated by 谷歌翻译